Add row_count() to count specific values row-wise #553

Merged
merged 13 commits into from
Oct 11, 2024

Conversation

strengejacke
Member

@strengejacke strengejacke commented Oct 10, 2024

@etiennebacher I'm not sure whether you think exact is the best argument name; I'm open to any better suggestions.

@etiennebacher
Member

etiennebacher commented Oct 10, 2024

I feel I'm always the one reluctant to add new functions here (sorry about that 😅) but this feels like a very narrow use case, especially since it's possible to do it in a few lines (unless I'm missing something):

library(datawizard)

dat <- data.frame(
  c1 = c("1", "2", NA, "3"),
  c2 = c(NA, "2", NA, "3"),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, Inf)
)

data_modify(
  dat,
  count_2 = rowSums(dat == 2, na.rm = TRUE),
  count_3 = rowSums(dat == 3, na.rm = TRUE)
)
#>     c1   c2 c3  c4 count_2 count_3
#> 1    1 <NA> NA   2       1       0
#> 2    2    2  4   3       2       1
#> 3 <NA> <NA> NA   7       0       0
#> 4    3    3 NA Inf       0       2

I haven't looked at the other PRs on row_ yet, but maybe it would be more efficient to allow rowwise operations rather than implementing several row_ functions? I don't think this would be easy though.

@strengejacke
Member Author

(unless I'm missing something)

Yes, you're missing something :-)

library(datawizard)

dat <- data.frame(
  c1 = c("1", "2", NA, "3"),
  c2 = c(NA, "2", NA, "3"),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, Inf)
)

data_modify(
  dat,
  count__2 = rowSums(dat == 2, na.rm = TRUE),
  count_string_2 = rowSums(dat == "2", na.rm = TRUE)
)
#>     c1   c2 c3  c4 count__2 count_string_2
#> 1    1 <NA> NA   2        1              1
#> 2    2    2  4   3        2              2
#> 3 <NA> <NA> NA   7        0              0
#> 4    3    3 NA Inf        0              0

row_count(dat, count = 2)
#> [1] 1 2 0 0
row_count(dat, count = "2")
#> [1] 1 2 0 0
row_count(dat, count = 2, exact = TRUE)
#> [1] 1 0 0 0
row_count(dat, count = "2", exact = TRUE)
#> [1] 0 2 0 0
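
The difference between the default behaviour and exact = TRUE can be illustrated with a minimal base R sketch. This is not the datawizard implementation: count_rowwise() and its type_safe argument are hypothetical names used only for illustration, and the type check here simply compares classes, so integer and double columns would be treated as different types.

```r
# Hypothetical helper (not part of datawizard): count how often `value`
# occurs in each row, optionally only in columns of the same class.
count_rowwise <- function(data, value, type_safe = FALSE) {
  hits <- vapply(
    data,
    function(column) {
      if (type_safe && !identical(class(column), class(value))) {
        # Column has a different type: nothing in it can match.
        return(rep(FALSE, length(column)))
      }
      # `==` coerces numbers to strings when one side is character;
      # missing values never count as a match.
      !is.na(column) & column == value
    },
    logical(nrow(data))
  )
  # Force a matrix so that single-row data frames also work.
  rowSums(matrix(hits, nrow = nrow(data)))
}

dat <- data.frame(
  c1 = c("1", "2", NA, "3"),
  c2 = c(NA, "2", NA, "3"),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, Inf),
  stringsAsFactors = FALSE
)

count_rowwise(dat, 2)
#> [1] 1 2 0 0
count_rowwise(dat, 2, type_safe = TRUE)
#> [1] 1 0 0 0
count_rowwise(dat, "2", type_safe = TRUE)
#> [1] 0 2 0 0
```

With coercion, the number 2 and the string "2" both match in character and numeric columns alike, which is why the plain rowSums(dat == 2) approach cannot tell them apart.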

@strengejacke
Member Author

strengejacke commented Oct 10, 2024

but this feels like a very narrow use case

It's not such a rare use case if you work with single items that you want to turn into scales (something we do often).

maybe it would be more efficient to allow rowwise operations

Where would you add a feature for rowwise operations? Also, row_sums()/row_means() have the benefit of doing rowwise operations only when a minimum number of valid values is present; I'm not sure this is possible with other functions.
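
As a rough illustration of the "minimum number of valid values" idea, here is a base R sketch. It is not how row_means() is implemented in datawizard; row_means_min_valid() and the items data are hypothetical and only show the principle: a row mean is returned only if the row has at least min_valid non-missing values.

```r
# Hypothetical helper (not part of datawizard): row means that are set to
# NA when a row has fewer than `min_valid` non-missing values.
row_means_min_valid <- function(data, min_valid) {
  values <- as.matrix(data)
  # Number of non-missing values per row.
  n_valid <- rowSums(!is.na(values))
  # Mean of the non-missing values per row.
  means <- rowMeans(values, na.rm = TRUE)
  # Discard rows that do not have enough valid values.
  means[n_valid < min_valid] <- NA_real_
  means
}

# Made-up item scores for four respondents.
items <- data.frame(
  item1 = c(1, 2, NA, 4),
  item2 = c(2, NA, NA, 4),
  item3 = c(3, 4, NA, 5)
)

# Rows 1, 2 and 4 have at least two valid values; row 3 has none.
row_means_min_valid(items, min_valid = 2)

# Requiring all three items additionally drops row 2.
row_means_min_valid(items, min_valid = 3)
```

A generic rowwise mechanism would have to expose this kind of threshold for every function it wraps, which is the difficulty raised above.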

Member

@etiennebacher etiennebacher left a comment

Fair enough, I should have read the changes before commenting ;)

A few comments

NEWS.md (outdated)
R/row_count.R (outdated)
R/row_count.R (outdated)
R/row_count.R (outdated)
R/row_count.R (outdated)
R/row_count.R
R/row_count.R (outdated)
tests/testthat/test-row_count.R
@strengejacke
Member Author

Great! What about the argument name exact? (See my initial comment)
I thought an alternative could be type_safe?

@etiennebacher
Member

I think allow_coercion = TRUE/FALSE would be clearer

R/row_count.R (outdated)
@etiennebacher etiennebacher changed the title from "Draft row_count()" to "Add row_count() to count specific values row-wise" on Oct 11, 2024
@etiennebacher
Member

Thanks!

@etiennebacher etiennebacher merged commit 5ce207b into main Oct 11, 2024
22 checks passed
@etiennebacher etiennebacher deleted the row_count branch October 11, 2024 09:43